Abstract

In a seminal paper, Caspar and Klug established a theory that provides a family
of polyhedra as blueprints for the structural organisation of viral capsids. In particular,
these polyhedra encode the locations of the proteins in the shells that encapsulate, and hence
protect, the viral genome. Despite its huge success and numerous applications
in virology, experimental results have shown that the theory is
too restrictive to describe all known viruses. In particular, the family of Papovaviridae,
which contains cancer-causing viruses, falls outside the scope of this theory.

In earlier work we have shown that certain members of the family of Papovaviridae can be
described via tilings. In this paper, we develop a comprehensive mathematical framework 
for the derivation of all surface structures of viral particles in this family. We show that 
this formalism fixes the structure and relative sizes of all particles collectively so that there 
exists only one scaling factor that relates the sizes of all particles with their biological 
counterparts. 

The series of polyhedra derived here complements the Caspar-Klug family of polyhe- 
dra. It is the first mathematical result that provides a common organisational principle 
for different types of viral particles in the family of Papovaviridae and paves the way for 
an understanding of Papovaviridae polymorphism. Moreover, it provides crucial input 
for the construction of assembly models along the lines of earlier work.



1 Introduction 

Icosahedral symmetry plays a fundamental role in the structure of viruses because it
constrains the organisation of the proteins in the viral capsids that protect the viral genome.
Based on this observation, Caspar and Klug have developed a landmark theory in which 
they derive a family of icosadeltahedra, i.e. polyhedra with icosahedral symmetry and faces 
given by equilateral triangles, that act as blueprints for the organisation of viral capsids. In 
particular, the corners of the triangular faces of these polyhedra encode the locations of the 
protein subunits in the capsids. This implies that proteins are organised in clusters of 5 or 6 
protein subunits, called pentamers and hexamers, respectively. The polyhedra consist of $20T$
triangular facets, where $T = h^2 + hk + k^2$ with $h \in \mathbb{N} \cup \{0\}$, $k \in \mathbb{N}$ denotes the triangulation
number that parameterises the family of polyhedra. As a consequence, there are precisely
12 pentamers and $10(T - 1)$ hexamers in a viral capsid corresponding to the polyhedron of
triangulation number $T$ in this family, and the corresponding capsid contains $60T$ protein
subunits.

1 E-mail: tk506@york.ac.uk
2 E-mail: rt507@york.ac.uk
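The counting rules of Caspar-Klug Theory stated above can be illustrated with a minimal Python sketch. The function names are hypothetical and chosen only for this illustration; the formulas are the ones given in the text.

```python
def triangulation_number(h: int, k: int) -> int:
    """Return T = h^2 + h*k + k^2 for h >= 0 and k >= 1."""
    if h < 0 or k < 1:
        raise ValueError("require h >= 0 and k >= 1")
    return h * h + h * k + k * k


def capsid_counts(t: int) -> dict:
    """Counts implied by Caspar-Klug Theory for triangulation number t."""
    return {
        "facets": 20 * t,          # triangular facets of the polyhedron
        "pentamers": 12,           # one per 5-fold vertex
        "hexamers": 10 * (t - 1),
        "subunits": 60 * t,
    }


def triangulation_numbers_up_to(limit: int) -> list:
    """All distinct values of T not exceeding limit, in increasing order."""
    values = set()
    for h in range(0, limit + 1):
        for k in range(1, limit + 1):
            t = triangulation_number(h, k)
            if t <= limit:
                values.add(t)
    return sorted(values)
```

As a consistency check, the subunit count equals five proteins per pentamer plus six per hexamer: $5 \cdot 12 + 6 \cdot 10(T - 1) = 60T$.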

The family of polyhedra established in Caspar-Klug Theory is fundamental in virology 
and has a plethora of applications ranging from image analysis of experimental data to the 
construction of assembly models. Despite its huge success, experimental evidence has shown
that it is too restrictive to account for all known viruses. In particular, viruses in the family
of Papovaviridae fall outside the scope of Caspar-Klug Theory, and their organisation has
therefore been a long-standing open problem in virology, which was formulated by
Liddington et al. in 1991 as follows: "The puzzle is how do the coloured pentamers fit into
the hexavalent holes?". In particular, the experiments show that the protein clusters
located off the global 5-fold axes of icosahedral symmetry can also be pentamers, which is by
construction excluded by the approach of Caspar and Klug because they are working with a 
hexagonal lattice. 

We have provided a solution to this puzzle based on tiling theory in earlier work, where we
have constructed a polyhedron that describes the surface structure of the viral particles
observed experimentally. By construction, these models encode not only the locations of the protein
subunits but also those of the inter-subunit bonds that connect different pentamers in the
capsid. These results have provided the basis for the construction of assembly models for
Papovaviridae, and the tiling approach has moreover paved the way for the study of
crosslinking structures.

It is the purpose of this work to develop a mathematical framework for the systematic
construction of polyhedra associated with viral capsids composed of pentamers throughout,
and hence to derive a family of polyhedra that constitute the exceptional cases needed to
complement the Caspar-Klug series of polyhedra for the description of viruses.

Since we are seeking polyhedra that encode the locations of pentamers off the 5-fold axes 
of the icosahedral group, it is not possible to work with a hexagonal lattice as in the case 
of the Caspar-Klug construction, and a completely different mathematical approach is hence 
required. In particular, the coordinates of the vertices have to be retrieved from an icosahedral 
lattice, i.e. a lattice that is invariant under the action of the icosahedral group. Such a 
lattice does not exist in three dimensions, but can be inferred from a six-dimensional space
via projection. This procedure is known from the study of aperiodic structures, which
describe the locations of atoms in alloys called quasicrystals. An additional complication
arises due to the fact that such projections only lead to discrete point sets if the number of 
lattice points projected from the higher-dimensional space is restricted in an appropriate way. 
We use here a restriction that is rooted in the structure of the icosahedral group itself, and 
can be obtained via a so-called affine extension of that group. It can be used to construct a 
family of finite dimensional point sets that act as blueprints for the coordinates of the desired 
polyhedra. 

We obtain three types of polyhedra in this way that correspond to the three species of 
particles observed in the family of Papovaviridae. The ratios of their radii are completely 
determined by the mathematical formalism, so that there exists only one scaling parameter 
that relates the geometries of all polyhedra collectively to the biological setting. It can be 
used to test the predictions of our theory as we discuss in section 6.

The smallest polyhedron in the series is a triacontahedron, and we therefore call our series
of exceptional polyhedra the triacontahedral series. The 30 faces of the triacontahedron are 
rhombs. By subdividing each rhomb into two triangles, it can be seen that the triaconta- 
hedron is structurally similar to the T = 3 capsid in Caspar-Klug Theory. However, the 
coordinates of the vertices are of a different type: while the coordinates of the triacontahe- 
dron can be obtained from a higher-dimensional icosahedral lattice via projection, this is not 
the case for the T = 3 Caspar-Klug structure. Moreover, the vertices of the polyhedra in 
the triacontahedral series are located on nested spherical shells rather than a single spherical 
shell, so that these particles are only approximately spherical. For example, the triacontahe- 
dron has 12 vertices (corresponding to an icosahedron) located on one shell and 20 vertices 
(corresponding to a dodecahedron) located on a different shell within. 

The polyhedra corresponding to the medium and large sized particles have vertices on 
three different nested shells. The vertices at which 5 faces meet mark the centres of the 
pentamers, and the locations of the protein subunits correspond to the angles of the
corresponding faces. In combination with the tiling approach [Twarock:2004a, Twarock:2005a],
it is moreover possible to deduce the bonding structure of the viral capsids they represent, as
we discuss below.

Our mathematical set-up moreover provides a framework for the study of the scaling 
transformations and rotations that map the coordinates of the polyhedra on other points 
induced from the higher-dimensional icosahedral lattice by projection. This is important be- 
cause viral capsids are three- rather than two-dimensional objects, and these transformations 
can be used to associate a three-dimensional structure with the two-dimensional surfaces of 
the polyhedron that provides the blueprint for the capsid. This has been demonstrated in
[12] for the case of Human Rhinovirus. The viral capsid of this virus follows the blueprint of
our small shell, the triacontahedron. It has been shown in this reference that the viral capsid
is contained between two copies of the triacontahedron that are related by a scaling by the
irrational number $\tau := \tfrac12(1 + \sqrt{5})$, which corresponds to a transformation of the type we are
considering here. In section 5 we show how these transformations can be derived within our
formalism for all viruses in the triacontahedral series. This paves the way for a construction
of three-dimensional structures associated with the blueprints provided by the polyhedra for 
all viruses in this series, and hence to derive the locations also of the proteins that are located 
within the capsid and delimit the cavity filled by the genetic material. Such information is 
invaluable for example for the study of scaffold mediated assembly. 

The paper is organised as follows. In section 2 we outline the projection formalism
that connects a generalised lattice (or quasi-lattice) based on the simple root vectors of
the non-crystallographic Coxeter group $H_3$ with the root lattice of $D_6$. In section 3 we
derive finite dimensional subsets of this generalised lattice based on an affine extension of the
non-crystallographic Coxeter group $H_3$. In section 4 we use these sets for the construction
of the triacontahedral series of polyhedra. The projection picture is extended in section
5, and a formalism is established that allows us to determine the range of possible scaling
transformations, rotations and combinations thereof that map the vertices of the polyhedra
on other points in the generalised lattice obtained via projection, because this provides the
basis for the construction of three-dimensional models associated with these polyhedra along
the lines of [12]. A comparison with experimental results is provided in section 6. Finally,
we conclude with a discussion of the mathematical and biological implications of our results.

2 Point sets induced from an icosahedral lattice by projection



We are seeking polyhedra that are symmetric under the icosahedral group. Since the icosa- 
hedral group does not stabilize a lattice in three dimensions, a 3-dimensional generalised 
lattice or quasi-lattice has to be inferred from a higher dimensional crystallographic lattice 
via projection. 

We use the fact that the icosahedral group is crystallographic in 6 dimensions, and
construct our quasi-lattice from the root lattice of $D_6$ via projection. We remark that it would
also have been possible to construct a quasi-lattice in three dimensions via a projection from
$\mathbb{Z}^6$ rather than $D_6$ as in [12]. However, our approach is more general, because the root lattice
of $D_6$ allows for an embedding with maximal symmetry [13].

Let $e = \{e_j \mid j = 1, \ldots, 6\}$ denote the standard basis in 6 dimensions with $(e_i, e_j) = \delta_{ij}$,
$i, j = 1, \ldots, 6$. The root lattice of $D_6$ corresponds to the $\mathbb{Z}$-linear span of the 6 simple root
vectors (or roots for short) of the Coxeter group $D_6$. They can be expressed in terms of the
standard basis $e$ as follows:

$$
\begin{aligned}
a_1 &= e_2 - e_1, & a_2 &= e_1 - e_3, \\
a_3 &= e_3 - e_6, & a_4 &= e_5 + e_6, \\
a_5 &= -e_4 - e_5, & a_6 &= e_4 - e_5.
\end{aligned}
\tag{1}
$$

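We choose a projection to 3 dimensions that maps the basis $e$ on the vectors pointing to
the six non-aligned vertices of an icosahedron. The coordinates of these vectors are given in
terms of the 3-dimensional standard basis below, using $\tau = \tfrac12(1 + \sqrt{5})$:

$$
\begin{aligned}
e_1 &\mapsto \tfrac12(1, 0, \tau), & e_2 &\mapsto \tfrac12(\tau, 1, 0), \\
e_3 &\mapsto \tfrac12(0, \tau, 1), & e_4 &\mapsto \tfrac12(-1, 0, \tau), \\
e_5 &\mapsto \tfrac12(0, -\tau, 1), & e_6 &\mapsto \tfrac12(\tau, -1, 0).
\end{aligned}
\tag{2}
$$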

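Hence, by virtue of (1) and (2), the vectors $a_j$, $j = 1, \ldots, 6$, project on the vectors in
(3). Coordinates are again expressed in terms of the standard basis in 3 dimensions, with
the notation $\tau' = \tfrac12(1 - \sqrt{5})$ for the Galois conjugate of $\tau$ and the identity $\tau + \tau' = 1$:

$$
\begin{aligned}
a_1 &\mapsto \bar a_1 = \tfrac12(-\tau', 1, -\tau), & a_2 &\mapsto \bar a_2 = \tfrac12(1, -\tau, -\tau'), \\
a_3 &\mapsto \tau \bar a_3 = \tfrac12(-\tau, \tau^2, 1), & a_4 &\mapsto \tau \bar a_2 = \tfrac12(\tau, -\tau^2, 1), \\
a_5 &\mapsto \tau \bar a_1 = \tfrac12(1, \tau, -\tau^2), & a_6 &\mapsto \bar a_3 = \tfrac12(-1, \tau, -\tau').
\end{aligned}
\tag{3}
$$

$\bar a_1$, $\bar a_2$ and $\bar a_3$ correspond to the simple root vectors of the non-crystallographic Coxeter group $H_3$.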
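The projection can be checked numerically. The following minimal Python sketch takes the images of the basis vectors from (2), forms the projected simple roots according to (1), and tests two properties: that $a_5$, $a_4$, $a_3$ project on $\tau$ times the images of $a_1$, $a_2$, $a_6$, and that the latter three vectors have the Cartan matrix of $H_3$. All function and variable names are assumptions made for this illustration.

```python
import math

TAU = (1 + math.sqrt(5)) / 2

# Images of the standard basis vectors e_1..e_6 under the projection, as in (2).
E = {
    1: (0.5, 0.0, TAU / 2),
    2: (TAU / 2, 0.5, 0.0),
    3: (0.0, TAU / 2, 0.5),
    4: (-0.5, 0.0, TAU / 2),
    5: (0.0, -TAU / 2, 0.5),
    6: (TAU / 2, -0.5, 0.0),
}


def combine(*terms):
    """Return sum(c * E[i]) for the given (c, i) pairs."""
    return tuple(sum(c * E[i][n] for c, i in terms) for n in range(3))


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def scale(c, v):
    return tuple(c * x for x in v)


def close(u, v, eps=1e-12):
    return all(abs(x - y) < eps for x, y in zip(u, v))


# Projected simple roots of D_6, following (1).
PROJECTED = {
    1: combine((1, 2), (-1, 1)),
    2: combine((1, 1), (-1, 3)),
    3: combine((1, 3), (-1, 6)),
    4: combine((1, 5), (1, 6)),
    5: combine((-1, 4), (-1, 5)),
    6: combine((1, 4), (-1, 5)),
}

# Simple roots of H_3: the images of a_1, a_2 and a_6.
H3_SIMPLE = (PROJECTED[1], PROJECTED[2], PROJECTED[6])


def folding_holds():
    """Check that a_5, a_4, a_3 project on tau times the H_3 simple roots."""
    b1, b2, b3 = H3_SIMPLE
    return (close(PROJECTED[5], scale(TAU, b1))
            and close(PROJECTED[4], scale(TAU, b2))
            and close(PROJECTED[3], scale(TAU, b3)))


def cartan_matrix():
    """Entries 2 (b_i | b_j) / (b_j | b_j) for the H_3 simple roots."""
    return [[2 * dot(x, y) / dot(y, y) for y in H3_SIMPLE] for x in H3_SIMPLE]
```

The sketch uses floating-point arithmetic, so equalities are tested up to a small tolerance rather than exactly.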

The projection $\pi_\parallel$ in (3) is illustrated in terms of the Dynkin diagrams of $D_6$ and $H_3$
in Fig. 1. On the left, the Dynkin diagram of $D_6$ is shown, in a folded and hence slightly
unconventional form, to illustrate how the simple root vectors of $D_6$ project on the root
vectors of $H_3$. In particular, the 5 above the link in the Coxeter diagram of $H_3$ on the right
indicates that the angle between the corresponding root vectors is $\pi - \frac{\pi}{5}$, with the 3 over the
other link omitted by convention. We remark that this procedure is similar to the projection
from $E_8$ on $H_4$ considered in [14].

Figure 1: The projection $\pi_\parallel$ of the simple roots of $D_6$ on those of $H_3$.

The root lattice of $D_6$ is given by $\mathbb{Z}$-linear combinations of the simple roots $a_j$, $j = 1, \ldots, 6$.
Correspondingly, its projection into $\mathbb{R}^3$ via $\pi_\parallel$ is given by $\mathbb{Z}[\tau]$-linear combinations of the
simple root vectors $\bar a_j$, $j = 1, \ldots, 3$, of $H_3$, where $\mathbb{Z}[\tau]$ denotes the extended ring of integers

$$
\mathbb{Z}[\tau] := \{a + \tau b \mid a, b \in \mathbb{Z}\}. \tag{4}
$$

Since $\mathbb{Z}[\tau]$ is dense in $\mathbb{R}$, the $\mathbb{Z}[\tau]$-linear combinations of the simple root vectors of $H_3$ are
dense in $\mathbb{R}^3$. Thus it is necessary to select subsets thereof which are suitable to guide the
search for the vertex sets of the polyhedra. Such sets will be derived in the following section
based on an affine extension of $H_3$.
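The density of $\mathbb{Z}[\tau]$ can be made concrete with a short Python sketch. It represents $a + \tau b$ by the integer pair `(a, b)`, multiplies using $\tau^2 = \tau + 1$, and lists powers of $\tau' = 1 - \tau$, which are elements of $\mathbb{Z}[\tau]$ that approach 0. The pair representation and the function names are assumptions made for this illustration.

```python
import math

TAU = (1 + math.sqrt(5)) / 2


def multiply(x, y):
    """(a + b*tau) * (c + d*tau), reduced with tau^2 = tau + 1."""
    a, b = x
    c, d = y
    return (a * c + b * d, a * d + b * c + b * d)


def value(x):
    """Real number represented by the pair (a, b)."""
    return x[0] + x[1] * TAU


def powers_of_conjugate(n):
    """Return [tau'^1, ..., tau'^n] as integer pairs, with tau' = 1 - tau."""
    conjugate = (1, -1)
    result = []
    current = (1, 0)
    for _ in range(n):
        current = multiply(current, conjugate)
        result.append(current)
    return result
```

Since $|\tau'| < 1$, the values `value(p)` for `p` in `powers_of_conjugate(n)` shrink geometrically while the integer coefficients grow, so $\mathbb{Z}[\tau]$ contains non-zero elements arbitrarily close to 0, and integer multiples of these come arbitrarily close to any real number.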



3 Nested shell structures via an affine extension of $H_3$

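The simple root vectors $\bar a_j$, $j = 1, \ldots, 3$, in (3) form a basis of the root system of $H_3$ [15].
The complete root system $\Delta$ is given by

$$
\Delta = \left\{
\begin{array}{ll}
(\pm 1, 0, 0) & \text{and all permutations}, \\
\tfrac12(\pm 1, \pm\tau, \pm\tau') & \text{and all even permutations}.
\end{array}
\right.
\tag{5}
$$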

The root vectors in $\Delta$ point to the vertices of an icosidodecahedron. They encode the
generators of the Coxeter group $H_3$ as follows. Let $\alpha \in \Delta$ and let $(\cdot|\cdot)$ denote the scalar
product in $\mathbb{R}^3$. Then

$$
r_\alpha : x \mapsto x - \frac{2(x|\alpha)}{(\alpha|\alpha)}\,\alpha \quad \text{for } x \in \mathbb{R}^3 \tag{6}
$$

is the Euclidean reflection in the plane orthogonal to the root vector $\alpha$. The icosidodecahedron
representing the root system $\Delta$ and two of the reflection planes that are encoded by
the root vectors are illustrated in Fig. 2. The intersection of the two planes in the figure
corresponds to an axis of 5-fold symmetry, which intersects the sphere at two of the 12
5-fold vertices of a (spherical) icosahedron. The other reflection planes are not shown, but
their intersections with the surface of the sphere are indicated as geodesics (i.e. as spherical
arcs obtained as intersections of planes through the centre of the sphere with the surface of
the sphere). The intersections of the geodesics mark the locations of all symmetry axes, and
one can hence reconstruct the locations of all 6 five-fold, 15 two-fold and 10 three-fold axes
of rotational symmetry of the icosahedral group.

Note that, in contrast to the case of Weyl groups, $\frac{2(\alpha|\beta)}{(\alpha|\alpha)} \in \mathbb{Z}[\tau]$ (instead of $\mathbb{Z}$) for all $\alpha$,
$\beta \in \Delta$. Therefore, $\mathbb{Z}$-linear combinations of root vectors in $\Delta$ do not form a crystallographic
lattice, and $\Delta$ is therefore called non-crystallographic. However, it is possible to select subsets
which form generalised lattices such as the quasi-lattices known from quasicrystals [10].

Figure 2: The root polytope (left) encoding the locations of the planes of reflection (right).

In order to extend the root system in a way compatible with overall icosahedral symmetry,
the basis of simple root vectors has to be extended. For this, we use the fact that the relations
between the simple root vectors are encoded in the Cartan matrix $C$,

$$
C := \left( \frac{2(\bar a_i | \bar a_j)}{(\bar a_j | \bar a_j)} \right)_{ij}
= \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -\tau \\ 0 & -\tau & 2 \end{pmatrix}. \tag{7}
$$

We extend this matrix by an additional row and column via a formalism known as affine
extensions in the theory of Kac-Moody algebras. The only difference here is that entries
stem from the set $\mathbb{Z}[\tau]$ rather than $\mathbb{Z}$. We have shown previously that the affine extended Cartan
matrix of $H_3$ is given by

$$
\begin{pmatrix}
2 & 0 & \tau' & 0 \\
0 & 2 & -1 & 0 \\
\tau' & -1 & 2 & -\tau \\
0 & 0 & -\tau & 2
\end{pmatrix}. \tag{8}
$$

The additional row and column encode a further root vector corresponding to an affine
reflection that acts as a translation $T$ by the highest root vector $\alpha_H = \tau \bar a_1 + 2\tau \bar a_2 + \tau^2 \bar a_3$.
The three other reflections are cyclic operations of order two, and the products of any two of
them correspond to rotations around the origin as follows:



$$
(r_j r_k)^M = 1 \quad \text{where} \quad
\begin{cases}
M = 1 & \text{if } C_{jk} = 2, \\
M = 2 & \text{if } C_{jk} = 0, \\
M = 3 & \text{if } C_{jk} = -1, \\
M = 5 & \text{if } C_{jk} = -\tau.
\end{cases}
\tag{9}
$$
The affine extended group is hence generated by the three reflections $r_1$, $r_2$, $r_3$ and the
translation $T$.
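The orders of the products of the simple reflections can be confirmed numerically. The Python sketch below writes each reflection as a $3 \times 3$ matrix, using the coordinates of the simple root vectors of $H_3$ obtained from the projection, and determines the smallest power of a product that equals the identity. The helper names and the floating-point tolerance are assumptions made for this illustration.

```python
import math

TAU = (1 + math.sqrt(5)) / 2
TAU_P = (1 - math.sqrt(5)) / 2

# Simple root vectors of H_3 in the coordinates used for the projection.
SIMPLE_ROOTS = (
    (-TAU_P / 2, 0.5, -TAU / 2),
    (0.5, -TAU / 2, -TAU_P / 2),
    (-0.5, TAU / 2, -TAU_P / 2),
)


def reflection_matrix(a):
    """Matrix of x -> x - 2 (x|a)/(a|a) a."""
    n = sum(c * c for c in a)
    return [[(1.0 if i == j else 0.0) - 2 * a[i] * a[j] / n
             for j in range(3)] for i in range(3)]


def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def is_identity(m, eps=1e-9):
    return all(abs(m[i][j] - (1.0 if i == j else 0.0)) < eps
               for i in range(3) for j in range(3))


def order(m, max_order=12):
    """Smallest k >= 1 with m^k equal to the identity."""
    power = m
    for k in range(1, max_order + 1):
        if is_identity(power):
            return k
        power = matmul(power, m)
    raise ValueError("order exceeds max_order")


def product_order(j, k):
    """Order of r_j r_k for simple reflections, with indices 1, 2, 3."""
    rj = reflection_matrix(SIMPLE_ROOTS[j - 1])
    rk = reflection_matrix(SIMPLE_ROOTS[k - 1])
    return order(matmul(rj, rk))
```

For example, `product_order(2, 3)` corresponds to the pair of simple roots whose Cartan matrix entry is $-\tau$.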

It is now possible to construct, based on these operations, finite point sets with icosahedral
symmetry such that all points in the sets have a counterpart in the root lattice of $D_6$. In
this sense, they are finite subsets of generalised lattices induced by projection from $D_6$ and
correspond to potential candidates for the locations of the vertices of the polyhedra we are seeking.

The point sets are generated iteratively via the action of the generators of the extended
group on the origin 0. If the action of the translation operator $T$ is not restricted, one obtains
an infinite point set that densely fills $\mathbb{R}^3$. However, if $T$ acts only a finite number of times, $N$
say, while the action of all other operations is not restricted in order to guarantee icosahedral
symmetry of the overall configuration, finite dimensional point sets $S(N)$ are obtained which
become larger and denser with increasing $N$.

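Due to the fact that $T$ acts as a translation by the highest root vector, it has been possible
to derive a simple expression for the point sets $S(N)$ in terms of the root system $\Delta$:

$$
S(N) = \Big\{ \sum_{\alpha \in \Delta} n_\alpha \alpha \;\Big|\; n_\alpha \in \mathbb{N} \cup \{0\},\ \sum_{\alpha \in \Delta} n_\alpha \le N \Big\}. \tag{10}
$$

In particular, $S(N)$ consists of all $\mathbb{N} \cup \{0\}$-linear combinations of up to $N$ root vectors in $\Delta$.
$N$ is called the cut-off parameter because it limits the number of points in the set.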
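The description of $S(N)$ as the set of all sums of up to $N$ root vectors lends itself to a direct computation. The following Python sketch is one possible way to do this, not the procedure used by the authors: it stores twice each coordinate as an integer pair `(a, b)` standing for $a + b\tau$, so that sums are exact and coinciding points are identified reliably. The representation and all function names are assumptions made for this illustration.

```python
import math
from itertools import product

TAU = (1 + math.sqrt(5)) / 2
ZERO = (0, 0)


def negate(p):
    return (-p[0], -p[1])


def roots():
    """The 30 roots, with each coordinate doubled and stored as (a, b) = a + b*tau."""
    result = set()
    for position in range(3):
        for sign in (2, -2):  # doubled coordinates of (+-1, 0, 0)
            v = [ZERO, ZERO, ZERO]
            v[position] = (sign, 0)
            result.add(tuple(v))
    base = [(1, 0), (0, 1), (1, -1)]  # 1, tau, tau' = 1 - tau
    for signs in product((1, -1), repeat=3):
        v = [p if s == 1 else negate(p) for s, p in zip(signs, base)]
        for shift in range(3):  # even permutations
            result.add(tuple(v[shift:] + v[:shift]))
    return result


def add(x, y):
    return tuple((p[0] + q[0], p[1] + q[1]) for p, q in zip(x, y))


def point_set(n):
    """All sums of up to n root vectors (the origin counts as the empty sum)."""
    origin = (ZERO, ZERO, ZERO)
    points = {origin}
    frontier = {origin}
    all_roots = roots()
    for _ in range(n):
        frontier = {add(x, a) for x in frontier for a in all_roots} - points
        points |= frontier
    return points


def radius(x):
    """Euclidean norm of a point stored with doubled coordinates."""
    return math.sqrt(sum(((a + b * TAU) / 2) ** 2 for a, b in x))


def shell_radii(points, digits=6):
    """Sorted distinct radii of the non-zero points, rounded to the given digits."""
    return sorted({round(radius(x), digits) for x in points if radius(x) > 0})
```

Calling `shell_radii(point_set(5))` gives the radii of the nested shells for cut-off parameter 5. The sketch has not been checked against the shell count reported in the text, and the number of points grows quickly with $N$.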

The sets $S(N)$ play a crucial role in the following for the construction of the polyhedra.



4 Construction of the triacontahedral series of polyhedra from
$H_3$-induced shell models

By construction, the sets $S(N)$ contain the vertices of the polyhedra in the triacontahedral
series. Since we are looking for polyhedra that represent viral capsids given
in terms of pentamers that are (within a certain experimental error) equidistant from their
neighbouring pentamers, the vertices of the desired polyhedra follow (to a good approximation)
the vertices of the Platonic and Archimedean solids, because these are the
polyhedra with equal edge lengths. They can hence be used as templates for the search
for the coordinates of the polyhedra within the sets $S(N)$. Moreover, since we are looking for
polyhedra that mark the locations of pentamers, only those Platonic and Archimedean solids
are relevant that have 5-coordinated vertices, i.e. vertices at which 5 edges meet. The three
solids that fulfil this criterion are the icosahedron, the snub cube and the snub dodecahedron.

We have computed the sets $S(N)$ up to $N = 5$ explicitly. The points are organised
on 181 nested shells of radii between $R \approx 0.2361$ and $R = 5$. $S(5)$ contains all point sets
$S(N)$ with $N < 5$ by construction, and has been chosen because it is sufficiently dense to
provide coordinates for the polyhedra. We have used the icosahedron, the snub cube and the
snub dodecahedron, respectively, as templates to search for subsets of $S(5)$ with the desired
distribution of points. We have thus obtained those coordinates of our polyhedra that
mark the locations of 5-coordinated vertices. We have then determined further vertices (not
necessarily on a shell of the same radius) which together with these vertices form the vertex
set of the polyhedron. The procedure is discussed explicitly in the following subsections for
the individual cases, and the coordinates of the vertices corresponding to centres of pentamers
are provided in the Appendix.



4.1 The small species 

The vertex set of the icosahedron occurs for the first time in the set $S(3)$ on a shell of radius
$R \approx 1.1756$. There are two ways of obtaining polyhedra that match this vertex set. One of
them is the icosahedron itself and corresponds to the start of the Caspar-Klug series. The
other one is the rhombic triacontahedron with 30 rhombic faces. The remaining vertices of
this polyhedron correspond to the vertices of a dodecahedron located on a shell of radius
$R \approx 1.0705$ (i.e. there is a scaling factor of about 1.098 between the shells that contain the
two different types of vertices). The triacontahedron is shown in Fig. 3.
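The scaling factor between the two shells can be cross-checked against the geometry of a rhombic triacontahedron. The Python sketch below assumes standard coordinates that are not taken from the text: a 5-fold vertex at $(0, 1, \tau)$ and a 3-fold vertex at $(1, 1, 1)$. Since the ratio of the two radii does not depend on the overall scale, it can be compared directly with the ratio of the radii quoted above.

```python
import math

TAU = (1 + math.sqrt(5)) / 2

# Assumed standard coordinates of a rhombic triacontahedron (overall scale is arbitrary).
FIVE_FOLD_VERTEX = (0.0, 1.0, TAU)    # icosahedral vertex
THREE_FOLD_VERTEX = (1.0, 1.0, 1.0)   # dodecahedral vertex

# Ratio of the two shell radii quoted in the text.
REPORTED_RATIO = 1.1756 / 1.0705


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def shell_ratio():
    """Radius of the icosahedral shell divided by that of the dodecahedral shell."""
    return norm(FIVE_FOLD_VERTEX) / norm(THREE_FOLD_VERTEX)
```

In these coordinates the ratio is $\sqrt{1 + \tau^2} / \sqrt{3}$, which agrees with the quoted factor of about 1.098.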




Figure 3: The triacontahedron corresponding to the polyhedron of the small species. 

Since only those vertices at which 5 faces meet mark the locations of pentamers, both
cases correspond to capsids with 12 pentamers located on the axes of 5-fold symmetry of
the icosahedral group. However, only the centres of the pentamers coincide in both cases,
and the orientations of the pentamers are different: because the proteins are located in the
corners of the faces (shown schematically as dots in Fig. 4), they differ by a rotation,
as illustrated in this figure. It shows a face of the icosahedron with three protein
subunits on the left versus three faces of the triacontahedron on the right with 2 protein
subunits each. A superposition of both figures shows that the respective proteins are rotated
with respect to each other.




Figure 4: The locations of proteins on the triangular faces of the icosahedron (left) are rotated
with respect to the locations of the proteins on the rhombs (right).

This difference has far-reaching consequences for the structure of the capsids. In par- 
ticular, in one of the cases crosslinking structures are possible while they cannot occur for 
geometrical reasons in the other case, as has been demonstrated in earlier work.

The case of the triacontahedron is hence essentially different from the T = 1 case in the 
Caspar-Klug family and marks the start of a different series of polyhedra. 

4.2 The intermediate species 

Due to the fact that the snub cube has octahedral rather than icosahedral symmetry, we
need to use the geometric inclusions illustrated in Fig. 5 in order to locate its vertices within
the point set $S(5)$.

Figure 5: A cube inscribed into a dodecahedron (left) and an octahedron into a cube (right).

The points of $S(5)$ that are closest to the vertex set of the snub cube occur on a shell of
radius $R \approx 2.3199$, which appears in $S(4)$ for the first time (i.e. it is not contained in a set
$S(N)$ with $N < 4$). As expected, the edge lengths are not all equal in this case,
but the deviations are small. In particular, one obtains the following values for the three
different distances $A$, $B$ and $C$ between neighbouring points in the set:

$$
A \approx 1.7481, \quad B \approx 1.6625, \quad C \approx 1.8783. \tag{11}
$$

The point set is illustrated in Fig. 6 (using the 3d-Grapher software) in relation to the
locations of the 2-, 3- and 4-fold symmetry axes of the octahedral group. Moreover, the
triangle $\Delta_{abc}$ formed from the distances $A$, $B$ and $C$ has been superimposed.




Figure 6: The triangle $\Delta_{abc}$ in the octahedral case.

From the triangle $\Delta_{abc}$ the polyhedron is constructed via the following argument. The
line denoted as $C$ is centred on a global 2-fold axis of the symmetry group. The face
containing that edge is hence 2-fold symmetric about that axis. We compute the intersection
of this axis with the edge $C$, and determine a line that goes through that intersection point, is
perpendicular to $C$ and intersects the sphere in two points equidistant from the intersection
point. We then determine the intersection of this line with the line through the global 4-fold
axis (marked as an encircled 4 in Fig. 6) and the centre of edge $B$. The intersection is located
within the sector of the sphere that is characterised by the triangle $\Delta_{abc}$ and determines
one further type of vertex of the polyhedron that does not correspond to the centre of a
pentamer. A third type of vertex is located on the global 3-fold axis, which is indicated as an
encircled 3 in Fig. 6. There is flexibility in the construction in that the vertices on the global
3- and 4-fold axes do not need to be located on a shell of the same radius as the vertices
representing pentamers. However, since they do not mark locations of
pentamers, their exact location is not important for our purpose. It is important to note,
though, that a variation of the radii of the shells on which they are located causes changes in
the angles of the faces.

One hence obtains the polyhedron illustrated schematically in Fig. 7. The fundamental
domain of the octahedral group corresponds to the triangle between the 2-fold, 3-fold, and
4-fold axes marked as encircled 2s, 3s and 4s in Fig. 7, and the entire polyhedron can be obtained
from it via the action of the octahedral group. Therefore, only a part of the polyhedron
overlapping with the fundamental domain is shown, because the rest of the polyhedron is
implied by symmetry. It is a schematic representation in that the vertices not marking the
locations of the pentamers have been drawn on the same shell. However, this illustration
shows all features of the polyhedron that are needed to read off the locations of the protein
subunits, which are located in those corners of the faces that meet in multiples of 5 at a
vertex.



Figure 7: Schematic representation of the polyhedron corresponding to the medium species. 

It is a polyhedron in terms of rhomb- and kite-shaped faces. The coordinates of the 
vertices that mark the centres of the pentamers are given in the Appendix. We remark that 
the angles of the tiles around these vertices are not equal, but the deviations are small. In 
particular, the angles on the kites are larger than those on the rhombs. This fact may account 
for the experimental observation that viral particles corresponding to such a polyhedron have 
only been obtained in vitro, but have never been observed in vivo. 

4.3 The large species 

The construction of the polyhedron corresponding to the large species is analogous to the
procedure used before. In this case, we use the vertex set of the snub dodecahedron as a
template. The subset of $S(5)$ closest to it is located on a shell of radius $R \approx 3.1247$ and
appears in $S(5)$ for the first time.

Figure 8: The triangle $\Delta_{abc}$ in the icosahedral case.

The three different distances between neighbouring points are given in (12).

$$
A \approx 1.3708, \quad B \approx 1.7520, \quad C \approx 1.4364. \tag{12}
$$

They define the triangle $\Delta_{abc}$ in Fig. 8. We start again by fixing the rhomb on the
2-fold axis containing $C$, as well as the angles of the kite-shaped faces around the 5-fold
axes. This determines the vertex within the sector corresponding to the triangle $\Delta_{abc}$ as
before. The radius of the shell that determines the locations of the global 5-fold vertices can
be computed such that the angles on the kite-shaped faces at the vertices meeting rhombs
match those of the rhombs containing $C$. Finally, the angles of the rhombs around the global
3-fold axes can be adjusted by varying the radius of the shell on which they are located. In
this way, the polyhedron in Fig. 9 is obtained. As before, it is a schematic representation
that encodes all important information about the polyhedron. The fundamental domain of
the icosahedral group corresponds to the triangle between the 2-fold, 3-fold, and 5-fold axes
marked as encircled 2s, 3s and 5s in Fig. 9, and the entire polyhedron can be obtained from
it via the action of the icosahedral group. It is hence sufficient to represent only part of
the polyhedron, provided that this contains the fundamental domain, in order to specify the
polyhedron completely.

The coordinates of the vertices that mark the locations of the pentamers are given in the 
Appendix. They correspond to 12 vertices located on the 5-fold axes of icosahedral symmetry, 
as well as a further 60 vertices off these axes.

Finally, we remark that the set S(5) contains the vertices of all three types of polyhedra. 
Hence, the geometrical formalism fixes the sizes of the three different species of particles 
in relation to each other, so that there is indeed only one free parameter that relates the 
overall mathematical structure, i.e. the small, medium and large polyhedron collectively, to 
the biological setting. We determine the corresponding scaling factor in section 6.

Figure 9: Schematic representation of the polyhedron corresponding to the large species.

5 Scaling transformations and rotations 

In order to determine the range of possible scaling transformations, rotations and
combinations thereof that map the vertices of the polyhedra onto other points in a set $S(N)$ (possibly
with different $N$), one needs to extend the projection picture presented in section 2.

5.1 The extended projection picture 

Under the action of the icosahedral group, $\mathbb{R}^6$ decomposes into two irreducible orthogonal
subspaces, which we denote as $E_\perp$ and $E_\parallel$. This decomposition induces a projection of the
root lattice of $D_6$ onto two different copies of the root system of $H_3$, which are located in $E_\perp$
and $E_\parallel$, respectively.

The projection of the simple roots of $D_6$ onto $E_\parallel$ has been considered in Fig. 1 in section
2. Analogously, we obtain a second projection, denoted as $\pi_\perp$, which acts as shown in Fig. 10.



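Figure 10: The projection $\pi_\perp$ of the simple roots of $D_6$ on those of $H_3$. [Diagram: the simple roots $a_1, a_2, a_3$ (top row) and $a_5, a_4, a_6$ (bottom row) of $D_6$ project on $\tilde a_1, \tilde a_2, \tau' \tilde a_3$ and $\tau' \tilde a_1, \tau' \tilde a_2, \tilde a_3$, respectively.]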

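The coordinates of the vectors in $E_\perp$ are given in (13).

$$
\begin{aligned}
\tilde a_1 &= \tfrac12(-\tau, -\tau', 1), & \tau' \tilde a_1 &= \tfrac12(1, -\tau'^2, \tau'), \\
\tilde a_2 &= \tfrac12(1, -\tau, -\tau'), & \tau' \tilde a_2 &= \tfrac12(\tau', 1, -\tau'^2), \\
\tilde a_3 &= \tfrac12(-1, -\tau, \tau'), & \tau' \tilde a_3 &= \tfrac12(-\tau', 1, \tau'^2).
\end{aligned}
\tag{13}
$$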

A comparison of (13) and (3) shows that the copies of $H_3$ in $E_\parallel$ and $E_\perp$ are related by
an interchange of $\tau$ and $\tau'$ in the coordinates of the root vectors. Hence, each simple root
vector $\bar a_j$, $j = 1, \ldots, 3$, of the $H_3$-copy in $E_\parallel$ has a counterpart $\tilde a_j$, $j = 1, \ldots, 3$, in one of the
simple root vectors of the $H_3$-copy in $E_\perp$ and vice versa, related by an exchange of $\tau$ and $\tau'$
in their coordinates. This applies also to the root system $\Delta$ in (5). The root systems of the
two $H_3$-copies in $E_\parallel$ and $E_\perp$ are shown superimposed onto each other in Fig. 11 as dark and
light dots, respectively, to demonstrate the geometric implications of this relation.




Figure 11: The root systems of the two $H_3$ copies in $E_\parallel$ and $E_\perp$.



As mentioned in section 2, the root lattice of $D_6$ is given by $\mathbb{Z}$-linear combinations of the
simple roots $a_j$, $j = 1, \ldots, 6$, and its projection $\pi_\parallel(D_6)$ onto $E_\parallel$ corresponds to the $\mathbb{Z}[\tau]$-linear
combinations of the simple root vectors $\bar a_j$, where $\mathbb{Z}[\tau]$ is the extended ring of integers defined
in (4). It is a dense set in $\mathbb{R}^3$ due to the properties of $\mathbb{Z}[\tau]$. Hence, a formalism is required
to select a discrete subset of $\pi_\parallel(D_6)$.

This is possible via the cut-and-project method known from the study of quasicrystal 
In particular, we define an automorphism * as a mapping from tt^(Dq) to tt±(Dq) that 
acts on a vector x G tt^(Dq) by exchanging r and r' in all coordinates. For any connected, 
bounded area $7 C E± we define point sets by 



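$$
\Sigma(\Omega) = \{\, x \in \pi_\parallel(D_6) \mid x^* \in \Omega \,\} \subset E_\parallel . \tag{14}
$$

The construction is illustrated in (15).

$$
\begin{array}{ccccc}
\Sigma(\Omega) \subset E_\parallel & \xleftarrow{\;\pi_\parallel\;} & \mathbb{R}^6 & \xrightarrow{\;\pi_\perp\;} & E_\perp \supset \Omega \\
 & & \cup & & \\
 & & D_6 & &
\end{array}
\tag{15}
$$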



The point sets $\Sigma(\Omega)$ have been extensively studied in the quasicrystal literature, where they
are used to model the locations of atoms in alloys [11, 10].
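The selection rule in (14) can be illustrated in one dimension. The sketch below is an analogue under stated assumptions, not the three-dimensional construction of the paper: elements of $\mathbb{Z}[\tau]$ are written as $m + n\tau$ with integers $m, n$, the map $*$ replaces $\tau$ by $\tau'$, and the window is the half-open interval $[0, 1)$, chosen here only for illustration. The function name `cut_and_project` and the parameter `bound` are hypothetical.

```python
import math

TAU = (1 + math.sqrt(5)) / 2
TAU_CONJ = (1 - math.sqrt(5)) / 2


def star(m, n):
    """Image of m + n*tau under the map * (tau replaced by tau')."""
    return m + n * TAU_CONJ


def cut_and_project(window, bound):
    """Points m + n*tau with |m|, |n| <= bound whose star image lies in [lo, hi)."""
    lo, hi = window
    points = []
    for m in range(-bound, bound + 1):
        for n in range(-bound, bound + 1):
            if lo <= star(m, n) < hi:
                points.append(m + n * TAU)
    return sorted(points)


# The full set Z[tau] is dense on the line; the window keeps a discrete subset.
points = cut_and_project(window=(0.0, 1.0), bound=20)
gaps = [b - a for a, b in zip(points, points[1:])]
distinct_gaps = sorted({round(g, 9) for g in gaps})
print(len(points), distinct_gaps)
```

For this window the selected points are separated by only two distances, $\tau$ and $\tau^2 = \tau + 1$, which is the kind of discrete, aperiodic point set that the construction is designed to produce.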



5.2 Scaling transformations and rotations via the projection picture 

The sets $S(N)$ in (10) are by construction finite dimensional subsets of the cut-and-project
quasicrystals $\Sigma(\Omega)$ in (14), i.e. one has the set inclusion $S(N) \subset \Sigma(\Omega)$ for suitably chosen
areas $\Omega$. A canonical choice for $\Omega$ is the Voronoi domain of the root lattice of $D_6$ projected
to $E_\perp$ [?]. It corresponds to a rhombic triacontahedron centred about the origin with edge
length $1/\sqrt{2}$ and inradius $R \approx 5.32$.

The coordinates of the vertices of our polyhedra that mark the locations of pentamers are
provided in the Appendix. They can be used to check that this choice of $\Omega$ is large enough
to contain their counterparts in $E_\perp$ under the mapping $*$. Let $V_j$, $j \in \{1, 2, 3\}$, denote these
vertex sets for the polyhedra corresponding to the large, medium and small particle, respectively,
and let $V_j^* := \{x^* \mid x \in V_j\}$ be their counterparts in $E_\perp$. Then one obtains the following radii
for the shells in $E_\perp$ on which the sets $V_j^*$ are located:

$$
R \approx 3.77, \qquad R \approx 2.76, \qquad R \approx 1.9, \qquad R \approx 4.16. \tag{16}
$$

Since these are all smaller than the inradius $R \approx 5.32$ of the triacontahedron that corresponds
to $\Omega$, the vertex sets of the polyhedra are indeed contained in $\Sigma(\Omega)$.
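The containment argument can be reproduced for individual vertices. The sketch below is an illustration under assumptions: it takes one representative vertex per particle from the Appendix, written as a function of $(\tau, \tau')$ so that the map $*$ amounts to exchanging the two arguments, and compares the radius of its image in $E_\perp$ with the quoted inradius. The dictionary `REPRESENTATIVES` and the choice of representatives are illustrative only.

```python
import math

TAU = (1 + math.sqrt(5)) / 2
TAU_CONJ = (1 - math.sqrt(5)) / 2
INRADIUS = 5.32  # inradius of the window quoted in the text

# One representative vertex per particle as a function of (tau, tau').
REPRESENTATIVES = {
    "large": lambda t, tc: (tc, 3, -tc),
    "medium": lambda t, tc: (t, t, 2 - t),
    "small": lambda t, tc: (1, 0, tc),
}


def radius(v):
    return math.sqrt(sum(x * x for x in v))


# Radii in E_parallel (coordinates as listed) and in E_perp (tau <-> tau').
radii_parallel = {k: radius(f(TAU, TAU_CONJ)) for k, f in REPRESENTATIVES.items()}
radii_perp = {k: radius(f(TAU_CONJ, TAU)) for k, f in REPRESENTATIVES.items()}

for name in REPRESENTATIVES:
    inside = radii_perp[name] < INRADIUS
    print(f"{name}: R_par = {radii_parallel[name]:.4f}, "
          f"R_perp = {radii_perp[name]:.2f}, inside window: {inside}")
```

The images of these three representatives lie at radii of about 3.77, 2.76 and 1.90 in $E_\perp$, all below the inradius, while their radii in $E_\parallel$ reproduce the values 3.1247, 2.3199 and 1.1756 stated for the three particles.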

Following [12] we call a transformation crystallographic if it can be expressed as an
$N$-dimensional matrix with integral entries in an $N$-dimensional vector space. All
transformations that map the vertex sets of our polyhedra onto points in $\Sigma(\Omega)$ are by construction
crystallographic in this sense. It is hence possible to determine the corresponding scaling
transformations, rotations and combinations thereof via the projection picture. In particular,
they can be studied as transformations in $E_\perp$ and correspond to those transformations
that map $V_j^*$ onto points in $\pi_\perp(D_6) \cap \Omega$.

• Scaling transformations:

All transformations that act as contractions in $\Omega$, that is scalings by $z = z_1 + \tau z_2 \in \mathbb{Z}[\tau]$
(cf. the definition of $\mathbb{Z}[\tau]$) with modulus $|z| < 1$, correspond to crystallographic scaling
transformations. In particular, they map $V_j^*$ onto the set $z V_j^* := \{z x^* \mid x^* \in V_j^*\}$. Using
$z^{**} = z$, this induces a scaling of $V_j$ by $z^*$ in $E_\parallel$, which acts as a stretching transformation
since $|z^*| > 1$.

Hence, every $z \in \mathbb{Z}[\tau]$ with $|z| < 1$ induces a stretching transformation $T_{z^*}$ on $V_j$ in $E_\parallel$.
An example is $z = 1 - \tau = \tau'$, which induces a scaling of the index set $V_j$ by $\tau$.

• Rotations:

Rotations can again be studied as rotations in $\Omega$. Since the shells on which the sets
$V_j^*$ are located are entirely contained within the inradius of the triacontahedron that
defines $\Omega$, any rotation in $\Omega$ that maps $V_j^*$ onto points in $\pi_\perp(D_6) \cap \Omega$ again induces a
transformation of $V_j$ in $\pi_\parallel(D_6)$.

The same holds for combinations of rotations and scaling transformations. In particular,
the polygrammal transformations that play an important role in [12] can also easily be studied
via the projection formalism in this way.
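The interplay between a contraction in $E_\perp$ and the induced stretching in $E_\parallel$ can be made concrete with a small helper for $\mathbb{Z}[\tau]$. This is a minimal sketch: the class `ZTau` and its methods are hypothetical names introduced for illustration, and the only facts used are $\tau + \tau' = 1$ and the example $z = 1 - \tau = \tau'$ from the text.

```python
import math
from dataclasses import dataclass

TAU = (1 + math.sqrt(5)) / 2
TAU_CONJ = (1 - math.sqrt(5)) / 2


@dataclass(frozen=True)
class ZTau:
    """Element z = z1 + tau*z2 of Z[tau] with integer coefficients z1, z2."""
    z1: int
    z2: int

    def value(self):
        """Numerical value of z."""
        return self.z1 + self.z2 * TAU

    def star(self):
        """Numerical value of z*, obtained by replacing tau with tau'."""
        return self.z1 + self.z2 * TAU_CONJ

    def conjugate(self):
        """z* rewritten as an element of Z[tau], using tau' = 1 - tau."""
        return ZTau(self.z1 + self.z2, -self.z2)


z = ZTau(1, -1)            # z = 1 - tau = tau'
contraction = z.value()    # scaling factor acting in E_perp
stretching = z.star()      # induced scaling factor z* acting in E_parallel

print(f"|z| = {abs(contraction):.4f}, |z*| = {abs(stretching):.4f}")
print("z** == z:", z.conjugate().conjugate() == z)
```

For this example $|z| = 1/\tau < 1$ while $|z^*| = \tau > 1$, so a contraction of the window coordinates corresponds to a stretching by $\tau$ of the vertex set in $E_\parallel$.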

6 Validation against experimental results 

Our approach can be tested based on the following two criteria: the predictions for the ratios
of the radii of the different species of particles, which are fixed completely by our mathematical
formalism, as well as the predictions concerning the locations of the inter-subunit bonds in
the capsids that are implied by the structure of the polyhedra.

6.1 Predictions concerning the relative radii of particles in different species 

Our construction fixes the radii of the different polyhedra with respect to each other, and there
is hence only one free parameter, $a$ say, that relates the geometries of all particles collectively
with their biological counterparts. In particular, we obtain a radius of $R_L = 3.1247$ for the
large particle, of $R_M = 2.3199$ for the medium particle and of $R_S = 1.1756$ (using the radius
at the global five-fold vertices where the pentamers are located) for the small particle. The
ratios of these are

$$
\frac{R_M}{R_S} \approx 1.9734, \qquad \frac{R_L}{R_S} \approx 2.658. \tag{17}
$$

The polymorphic assemblies of the major capsid protein of Simian Virus 40, which is a
member of the family of Papovaviridae, have been studied in [20]. The authors report the occurrence
of three different types of particles: a large particle with a diameter of approximately 40 to
45 nm, a medium-sized particle with a diameter of 25 to 35 nm and a small particle
with a diameter of approximately 20 nm. It has not been possible in that work to determine
whether the medium-sized particles correspond to the octahedral particles consisting of 24
pentamers reported in [21]. The radii of these particles are, taking averages, approximately

$$
R_L^{\mathrm{exp}} \approx 21.25\,\mathrm{nm}, \qquad R_M^{\mathrm{exp}} \approx 15\,\mathrm{nm}, \qquad \text{and} \qquad R_S^{\mathrm{exp}} \approx 10\,\mathrm{nm}. \tag{18}
$$

Since these are approximate values, we test whether any combination with small deviations
from these values is compatible with the ratios obtained theoretically in (17). One
finds that the combination

$$
R_L^{\mathrm{exp}} = 21.26\,\mathrm{nm}, \qquad R_M^{\mathrm{exp}} = 15.79\,\mathrm{nm}, \qquad \text{and} \qquad R_S^{\mathrm{exp}} = 8\,\mathrm{nm} \tag{19}
$$

corresponds to the ratios in (17). The correspondence with the values in (18) is remarkably
close. It implies that the ratio between large and medium-sized particles is in excellent
agreement with the experimental results. Moreover, given that the exact value of the small
sized shell is more difficult to determine experimentally than those of the larger ones, a
deviation of 2 nm from the experimentally measured average can be considered as an equally
good agreement with experimental results.

Based on (19), it is possible to compute the scaling factor that relates the geometry of
the shells of radii $R_L$, $R_M$ and $R_S$ in the theoretical model collectively with the biological
system, as

$$
a = \frac{R_A^{\mathrm{exp}}}{R_A} \approx 6.80\,\mathrm{nm} \quad \text{with } A \in \{S, M, L\}. \tag{20}
$$
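The arithmetic behind (17), (19) and (20) can be retraced in a few lines. The sketch below uses only the numbers quoted in the text; the variable names are illustrative. It recomputes the two ratios, derives the radii in nm that follow from them once the small particle is set to 8 nm, and evaluates the scaling factor separately for each particle.

```python
# Theoretical radii quoted in the text (dimensionless model units).
R_THEORY = {"L": 3.1247, "M": 2.3199, "S": 1.1756}
# Radii in nm from (19).
R_NM = {"L": 21.26, "M": 15.79, "S": 8.0}

# Ratios of the theoretical radii, as in (17).
ratio_M_S = R_THEORY["M"] / R_THEORY["S"]
ratio_L_S = R_THEORY["L"] / R_THEORY["S"]

# Radii in nm implied by these ratios once the small particle is set to 8 nm.
implied_nm = {"S": 8.0, "M": 8.0 * ratio_M_S, "L": 8.0 * ratio_L_S}

# One scaling factor per particle; the model predicts that they coincide.
scale = {k: R_NM[k] / R_THEORY[k] for k in R_THEORY}

print(f"R_M/R_S = {ratio_M_S:.4f}, R_L/R_S = {ratio_L_S:.4f}")
print({k: round(v, 2) for k, v in implied_nm.items()})
print({k: round(v, 2) for k, v in scale.items()})
```

All three quotients agree to within about 0.01 nm per model unit, which is what allows a single scaling factor to be used for the three particles collectively.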

6.2 Predictions concerning the bonding structure 

The polyhedra provide information on the locations of intersubunit bonds between the
proteins in different pentamers in the capsid. Based on the tiling approach developed earlier
[?], the faces of the polyhedra encode the locations of these interactions as follows:
a face with three corners corresponding to centres of pentamers in the capsid represents a
trimer interaction between the proteins located in these corners; respectively, a face with two
such corners represents a dimer interaction between these two protein subunits. The bonding
structure implied by the polyhedra of the medium and large species is illustrated in Fig. 12,
where the locations of the bonds are shown schematically by spiral arms.




Figure 12: The bonding structure of medium and large particles implied by the tiling approach.

The combination of dimer and trimer bonds predicted by our theory for the large shell in
Fig. 12 on the right coincides with the experimental results for the locations of the C-terminal
arm extensions observed in [22]. It would be interesting to validate also the predictions for
the bonding structure of the other species experimentally.



7 Conclusions 

The results in this paper provide answers to the two open mathematical questions stated in
the conclusion of [12].

Firstly, we have addressed the question of how the series of polyhedra that starts with the
triacontahedron and contains all polyhedra corresponding to all-pentamer configurations can
be completed. For this, we have derived a mathematical method that combines a projection
of the root lattice of $D_6$ with an affine extension of the non-crystallographic Coxeter group
$H_3$ in order to determine finite subsets of generalised lattices that encode the vertices of
these polyhedra. In this way, we have been able to establish the triacontahedral series that
complements the family of polyhedra in Caspar-Klug Theory.

Secondly, the mathematical formalism developed here provides a framework for the
systematic analysis of crystallographic scaling transformations and rotations for polyhedra with
vertices in the triacontahedral series. These are important as they can be used to associate
three-dimensional structures with the blueprints provided by the surface structures of the
polyhedra, as demonstrated in [12] for the case of Human Rhinovirus, which corresponds to
the small particle in our triacontahedral series. In this respect, our results pave the way for an
analysis of the interior organisation of all viral capsids corresponding to the triacontahedral
series.

From the biological point of view, our triacontahedral series closes an important gap
because it complements the series of Caspar-Klug polyhedra. As in the case of
the Caspar-Klug series, this paves the way for a broad spectrum of applications in virology.
In particular, due to the fact that Papovaviridae contain cancer-causing viruses, a better
understanding of their structure is highly desirable to assist the development of new
anti-viral drugs. For example, based on the structure of the polyhedra it is possible to develop
assembly models that explain how viral capsids self-assemble from their capsid proteins. The
theory developed here provides a framework for the generalisation of earlier work [?] to the
case of simultaneous assembly of different species of particles. Such models are important in
order to study how viral assembly can be misdirected as a means to interfere with the viral
replication cycle.

Appendix

In the Appendix, we provide the coordinates of the vertices of the polyhedra that correspond
to the centres of pentamers.

The vertices corresponding to the 60 pentamers off the 5-fold axes of the large particles
are located on a shell of radius $R \approx 3.1247$. Their coordinates are given in (21).



2t-t', 1 -2r',6-r) 
-(2r-r / ),l-2r / ,6-r) 
1 -2r',6-r, 2t-t') 
_(l_2r / ),-(6-r),2r-r / ) 
6-r, 2r-r / ,l-2r / ) 
-(6-t),-(2t-t , ),1-2t') 
r' + At, -1 - 4t',t) 
-(r / + 4r),-(-l-4r / ),r) 
-1 -At',t, r' + 4r) 
-(-l-4r , ),-r,r' + 4r) 
r,r' + 4r, -1 - At') 
-t, -( r ' + 4r),-l-4r / ) 
r-3r / ,l + 2r / ,2 + T) 
-(r-3r , ),-(l + 2r / ),2 + r) 
-(l + 2r'),-(2 + r),r-3r / ) 

1 + 2r / ,2 + T,r-3r / ) 

2 + r,-(r-3r / ),-(l + 2r / )) 
-(2 + T),r-3T',-(l + 2T , )) 
r + 2 + 2t, -(2t - 2t' - 1), -(-2 - r - 2r')) 
-(r' + 2 + 2r), 2r - 2r' - 1, -(-2 - r - 2r')) 
2r - 2r - 1, -2 - r - 2r , r + 2 + 2r) 
-(2r - 2t' - 1), -(-2 - r - 2t'),t' + 2 + 2t) 
-(-2 - r - 2r ), -(V + 2 + 2r), 2r - 2r' - 1) 
-2 - r - 2r', r' + 2 + 2r, 2r - 2r' - 1) 

(Vw) 

(-3,-tV) 

(-t',t',3) 

(r',-r',3) 

(-t',-3,-t') 

(r',3,-r') 
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2t-t , ,-(1-2t , ),-(6-t)) 
-(2t-t , ),1-2t , ,-(6-t)) 
1-2t',-(6-t),-(2t-t')) 
_(l_2r / ),6-r,-(2r-r / )) 
6-t,-(2t-t , ),-(1-2t / )) 
-(6-t),2t-t / ,-(1-2t / )) 
r' + 4r,-(-l-4r / ),-r) 
-(r' + 4r),-l -4t',-t) 
-1 -4t / ,-t,-(t / + 4t)) 
-(-1-4tV,-(t' + 4t)) 

r,-(r' + 4r),-(-l-4r')) 
-r,r / + 4r,-(-l-4r / )) 
t-3t',-(1 + 2t / ),-(2 + t)) 
-(r-3r'),l + 2r / ,-(2 + r)) 
-(l + 2r , ),2 + r,-(r-3r / )) 

1 + 2r',-(2 + r),-(r-3r / )) 

2 + T,r-3T / ,l + 2r') 
-(2 + r),-(r- 3r'),l + 2r') 
r' + 2 + 2r, 2t - 2r' - 1, -2 - r - 2r') 
-(r' + 2 + 2r), -(2r - 2r' - 1), -2 - r - 2r') 
2r - 2r' - 1, -(-2 - r - 2r'), -(r' + 2 + 2r)) 
-(2r - 2r' - 1), -2 - r - 2t',-(t' + 2 + 2r)) 
-(-2 - r - 2r'), r' + 2 + 2r, -(2r - 2r' - 1)) 
-2 - r - 2t', -(/ + 2 + 2r), -(2r - 2r' - 1)) 

(-3,r',-r') 
(-/,-/, -3) 
(r',r',-3) 
(-r',3,r') 
(t',-3,t') 

(21) 



The locations of the coordinates in (21) on a sphere are illustrated in Fig. 13 via the
3d-Grapher software.

Figure 13: The 60 vertices in (21) corresponding to the centres of the pentamers off the
global symmetry axes.
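Since the 60 vertices form a single orbit under the rotational icosahedral group, they can be regenerated from one of them. The sketch below is an illustration under assumptions that are not taken from the paper: the group is generated from a two-fold rotation about the z-axis, the cyclic permutation of the coordinates and a five-fold rotation about the axis $(1, 0, \tau)$, which is one standard matrix representation; the seed $(\tau', 3, -\tau')$ is one of the listed vertices. All function names are hypothetical.

```python
import math
from itertools import product

TAU = (1 + math.sqrt(5)) / 2
TAU_CONJ = (1 - math.sqrt(5)) / 2

# Assumed generators of the rotational icosahedral group (not from the paper).
G2 = ((-1, 0, 0), (0, -1, 0), (0, 0, 1))          # two-fold rotation about z
G3 = ((0, 1, 0), (0, 0, 1), (1, 0, 0))            # cyclic permutation (x,y,z) -> (y,z,x)
G5 = tuple(tuple(x / 2 for x in row) for row in (  # five-fold rotation about (1, 0, tau)
    (1, -TAU, 1 / TAU),
    (TAU, 1 / TAU, -1),
    (1 / TAU, 1, TAU),
))
IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))


def matmul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def apply(m, v):
    return tuple(sum(m[i][k] * v[k] for k in range(3)) for i in range(3))


def vec_key(v):
    """Rounded copy of a vector, used to identify equal points up to float error."""
    return tuple(round(x, 6) for x in v)


def mat_key(m):
    return tuple(round(x, 6) for row in m for x in row)


def generate_group(generators):
    """Close the generators under multiplication, starting from the identity."""
    group = {mat_key(IDENTITY): IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        new = []
        for g, h in product(frontier, generators):
            gh = matmul(g, h)
            if mat_key(gh) not in group:
                group[mat_key(gh)] = gh
                new.append(gh)
        frontier = new
    return list(group.values())


group = generate_group([G2, G3, G5])
seed = (TAU_CONJ, 3, -TAU_CONJ)
orbit = {vec_key(apply(g, seed)): apply(g, seed) for g in group}
radii = [math.sqrt(sum(x * x for x in v)) for v in orbit.values()]

print(len(group), len(orbit), round(min(radii), 4), round(max(radii), 4))
```

With these generators the group has 60 elements and the orbit of the seed consists of 60 distinct points, all on the shell of radius 3.1247; the orbit can be compared entry by entry with the coordinates listed in (21).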

The vertices corresponding to the centres of the remaining pentamers of the large particles
point to the vertices of an icosahedron inscribed into a sphere of radius $R = 5 \times 0.7265 \approx
3.6325$. The coordinates are given in (22).

$$
\tfrac{5}{2}\,(\pm(\tau - 3 - \tau'),\ 0,\ \pm 2\tau') \quad \text{and all cyclic permutations.} \tag{22}
$$



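The vertices of the polyhedron corresponding to the centres of the pentamers in the
capsids of the medium sized particles are located on a sphere of radius $R \approx 2.3199$. Their
coordinates are given in (23).

$$
\begin{array}{ll}
\tfrac12(-1-\tau+2\tau',\ 1,\ -3-\tau') & \tfrac12(-1,\ 3+\tau',\ -1-\tau+2\tau') \\
\tfrac12(-3-\tau',\ 1+\tau-2\tau',\ -1) & \tfrac12(3\tau'-1-\tau,\ 2\tau',\ 0) \\
\tfrac12(-2\tau',\ 0,\ 3\tau'-1-\tau) & \tfrac12(0,\ -3\tau'+1+\tau,\ -2\tau') \\
\tfrac12(\tau'-2,\ 2\tau'-2\tau+1,\ -\tau) & \tfrac12(-2\tau'+2\tau-1,\ \tau,\ \tau'-2) \\
(\tau,\ \tau,\ 2-\tau) & (\tau,\ -\tau,\ 2-\tau) \\
(\tau,\ -2+\tau,\ \tau) & (-\tau,\ -2+\tau,\ \tau) \\
(2-\tau,\ -\tau,\ \tau) & (2-\tau,\ -\tau,\ -\tau) \\
\tfrac12(2+\tau,\ 3\tau',\ 1-2\tau) & \tfrac12(-3\tau',\ -1+2\tau,\ 2+\tau) \\
\tfrac12(-\tau,\ 2-\tau',\ 2\tau-2\tau'-1) & \tfrac12(1-2\tau,\ -2-\tau,\ -3\tau') \\
\tfrac12(1+\tau',\ -3-\tau,\ 2-\tau+\tau') & \tfrac12(3+\tau,\ -2+\tau-\tau',\ 1+\tau') \\
\tfrac12(2-\tau+\tau',\ -1-\tau',\ 3+\tau) & (-\tau',\ 2,\ -1) \\
(-2,\ 1,\ -\tau') & (-1,\ \tau',\ -2)
\end{array}
\tag{23}
$$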



The locations of the coordinates in (23) on a sphere are illustrated in Fig. 14 via the
3d-Grapher software.

Figure 14: The 24 vertices in (23) corresponding to the centres of the pentamers.

The vertices of the triacontahedron that correspond to the centres of the pentamers of
the small particles are located on a sphere of radius $R \approx 1.1756$ with coordinates given in
(24).

$$
(\pm 1,\ 0,\ \pm\tau') \quad \text{and all cyclic permutations.} \tag{24}
$$
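The phrase "all cyclic permutations" can be expanded mechanically. The sketch below enumerates the twelve vertices of (24) and, as a second example, the twelve axial vertices of the large particle; for the latter it assumes that the prefactor in (22) is 5/2, which is the value implied by the stated radius $5 \times 0.7265$. The function name `cyclic_sign_orbit` is illustrative.

```python
import math
from itertools import product

TAU = (1 + math.sqrt(5)) / 2
TAU_CONJ = (1 - math.sqrt(5)) / 2


def cyclic_sign_orbit(a, b, scale=1.0):
    """All vertices scale*(+-a, 0, +-b) together with their cyclic permutations."""
    vertices = set()
    for sa, sb in product((1, -1), repeat=2):
        x, z = scale * sa * a, scale * sb * b
        for v in ((x, 0.0, z), (z, x, 0.0), (0.0, z, x)):
            vertices.add(tuple(round(c, 9) for c in v))
    return sorted(vertices)


def radius(v):
    return math.sqrt(sum(c * c for c in v))


# Small particle, as in (24).
small = cyclic_sign_orbit(1.0, TAU_CONJ)
# Axial vertices of the large particle, as in (22); the prefactor 5/2 is an assumption.
large_axis = cyclic_sign_orbit(TAU - 3 - TAU_CONJ, 2 * TAU_CONJ, scale=2.5)

print(len(small), round(radius(small[0]), 4))
print(len(large_axis), round(radius(large_axis[0]), 4))
```

Both sets contain twelve vertices, and their radii agree with the values 1.1756 and 3.6325 quoted in the text to within rounding.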